N94- 35559 




TDA Progress Report 42-117 




May 15, 1994 


The Development and Application of Composite Complexity 
Models and a Relative Complexity Metric in a 
Software Maintenance Environment 

J. M. Hops 

Radio Frequency and Microwave Subsystems Section 
J. S. Sherif 

Software Product Assurance Section 
and 

California State University, Fullerton 


A great deal of effort is now being devoted to the study, analysis, prediction,
and minimization of software maintenance expected cost, long before software is
delivered to users or customers. It has been estimated that, on the average, the
effort spent on software maintenance is as costly as the effort spent on all other
software costs. Software design methods should be the starting point to aid in
alleviating the problems of software maintenance complexity and high costs. Two
aspects of maintenance deserve attention: (1) protocols for locating and rectifying
defects, and for ensuring that no new defects are introduced in the development
phase of the software process, and (2) protocols for modification, enhancement, and
upgrading. This article focuses primarily on the second aspect, the development of
protocols to help increase the quality and reduce the costs associated with
modifications, enhancements, and upgrades of existing software. This study developed
parsimonious models and a relative complexity metric for complexity measurement
of software that were used to rank the modules in the system relative to one
another. Some success was achieved in using the models and the relative metric to
identify maintenance-prone modules.

I. Introduction 

A. Project Objectives 

The primary objective of this study was to determine 
whether software metrics could help guide our efforts in 
the development and maintenance of the real-time embedded
systems that we develop for NASA’s Deep Space Network
(DSN). Generally, the systems that are developed
control receivers, transmitters, exciters, and signal paths 
through the communication hardware. The most common 
programming language in our systems is PL/M for Intel 
8080, 8086, and 80286 microprocessors; and the systems 
range in size from 20,000 to 100,000 non-commented lines 
of code (NCLOC). Approximately 65 percent of the funding
received in our environment is dedicated to extending
the life span of the previously developed systems; of this, 
15 percent is spent on finding and fixing defects, while 
85 percent is spent on adding automation features, adding 
capabilities, and increasing capacity. 

Our efforts have been successful in that the life spans of 
our systems now range from 4 to 8 years and are increas- 
ing. As support for new spacecraft becomes necessary, 
these older systems are being used in new ways, thereby 
increasing the importance of high-quality, defect-free, and 
cost-effective enhancements to the software. Protocols and 
guidance for locating and rectifying defects in the software- 
sustaining environment were deemed critical, especially 
with the added complications that the maintainers of the 
systems are not the original developers and that there is 
little or no confidence in the software documentation. 

Specifically, we were looking for ways to identify which 
modules should be reengineered and which modules would 
need extra development and test time in order to main- 
tain. The problems we face in our environment are quite 
common in the industry. Software maintenance cost is 
about two to four times the original development cost 
[3,13,10,21]. Charette [5] emphasizes the fact that 60 to 
80 percent of the total software costs are related to main- 
tenance. This will likely remain so for the indefinite future 
[7,11,24]. 

Figure 1 shows the initial cost breakdown in develop- 
ing a new project (unfortunately with maintenance costs 
hidden), and Fig. 2 shows the costs of software during its 
life cycle, as discussed by Zelkowitz [34]. Software mainte- 
nance is not what people think it is: Software maintenance 
actually encompasses fixing software errors in addition to 
software enhancements and adding new functions to exist- 
ing systems, system conversion, training and supporting 
users, and improving system performance [31-33]. Error
correction, which is often perceived as the substance of 
maintenance, is only a small part of the software main- 
tenance effort [8,4]. Table 1 shows the distribution of 
the average time spent on various maintenance tasks for 
4 years, as reported by Lientz and Swanson [19]. Note that 
functional enhancement constitutes the major portion of 
the time spent on software maintenance. Charette [5] dis- 
cusses another reason why the cost of software is so high 
and cites some statistics as reported by the Comptroller 
General [6] and as shown in Table 2. It is reported that 
only 2 percent of the software contracted for could work on 
delivery; 3 percent could work after some rework; 45 per- 
cent was delivered, but was never successfully put to use; 
20 percent was used, but was either extensively reworked
or abandoned; and 30 percent was paid for, but was never
delivered. 

For the study described in this article, we took the fol- 
lowing steps: 

(1) Determined what the literature suggests. 

(2) Developed a course of action to be tried on one of our 
operational systems that would be representative of 
all the others. 

(3) Performed the steps and analyzed the results. 

The process and results of each of these steps are de- 
scribed below. 

B. Suggestions from the Literature and Course 
of Action 

One of the earlier studies encountered pertaining to 
our objectives was undertaken by Shen, Yu, Thebaut, and 
Paulsen [27]. They assessed the potential usefulness of 
product and process metrics in identifying components of 
the system that were most likely to contain errors. Their 
goal was to establish an empirical basis for the use of ob- 
jective criteria in developing strategies for the allocation 
of testing effort in the software-maintenance environment. 
It was found that the number of unique operands, as de- 
fined by Halstead [14], was the best predictor of problem 
reports on modules that were reported after the initial 
delivery. Additionally, simple metrics related to the num- 
ber of unique operands, such as the cyclomatic complexity 
(defined by McCabe [20]), also performed well. Shen et al. 
concluded that these metrics are useful in finding error- 
prone modules at an early stage [27]. 

In 1987, Kafura and Reddy [17] published the results 
of their study on using software complexity metrics during 
the software maintenance phase of a system. They related 
seven separate metrics to the experience of maintenance 
activities on medium-sized systems. Two of the results re- 
ported were that the overall complexity of a system grows 
with time and that the individual complexity scores of the 
software modules agree well with the expert opinions of 
the programmers. Their conclusion was that metrics could 
form the control element in a formal maintenance method. 

Harrison and Cook [15,16] discuss the decision, fre- 
quently encountered by software maintenance personnel, 
of whether to make an isolated change in a module or 
to totally redesign and rewrite the module anew. They 
developed an objective decision rule to identify modules
that should be rewritten rather than modified. This decision
rule is whether the total change in the Halstead
software science volume metric exceeds a threshold value. 
This threshold value seems to be subjective since it de- 
pends upon the decision maker's risk-taking propensity 
and experience and since it must be tuned for a partic- 
ular environment. 

Lennselius, Wohlin, and Vrana [18] discuss the possi- 
bility of using complexity metrics to identify error-prone, 
and thus maintenance-prone, modules. They suggest that 
a module whose complexity lies at least one standard de- 
viation above the acceptable mean of complexity of the 
project may be considered to be a maintenance-prone 
module. The authors, however, emphasize that metrics 
cannot replace the decision-making process of software 
managers. 
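The one-standard-deviation rule described by Lennselius et al. can be sketched in a few lines. This is a minimal illustration only: the function name, the module names, and the scores are hypothetical, and the use of the population standard deviation is an assumption, since the source does not say which form was applied.

```python
from statistics import mean, pstdev

def flag_maintenance_prone(scores):
    """Return (cutoff, flagged): modules scoring at least mean + 1 SD.

    `scores` maps a module name to its complexity score.
    Assumption: population standard deviation (pstdev).
    """
    values = list(scores.values())
    cutoff = mean(values) + pstdev(values)
    flagged = sorted(name for name, s in scores.items() if s >= cutoff)
    return cutoff, flagged

# Hypothetical complexity scores for five modules (illustrative only).
example_scores = {
    "mod_a": 4.0,
    "mod_b": 5.0,
    "mod_c": 6.0,
    "mod_d": 5.0,
    "mod_e": 30.0,
}
cutoff, flagged = flag_maintenance_prone(example_scores)
print(f"cut-off = {cutoff:.2f}, flagged = {flagged}")
```

As the authors stress, such a flag is only a prompt for a manager's judgment, not a replacement for it.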

Rodriguez and Tsai [23] use discriminant analysis to de- 
velop a methodology to evaluate software metrics. They 
suggest that when classifying units of software as either 
complex or normal, more attention is usually paid to the 
complex group to either redesign it or test it more thor- 
oughly. Their methodology is based on the assumption of 
normal distribution and homogeneity of variances of the 
two groups. The authors consider 13 metrics depicting 
Halstead’s software science metrics, McCabe complexity 
metrics, and NCLOC metrics. They conclude that these 
metrics are correlated. 

Stalhane [29] discusses how to estimate the number of 
defects in a software unit from various software metrics 
and how to estimate the reliability of the same software. 
The author also concludes that complexity increases as 
the size of code increases. Stalhane asserts that misunder- 
standing the specifications will increase with the specifica- 
tion complexity and that complexity may be transferred 
to the code and thus lead to maintenance-prone complex 
code and complex modules. 

Munson and Khoshgoftaar [21] employ factor analytic 
techniques to reduce the dimensionality of the complexity 
problem space to produce a set of reduced metrics. The re- 
duced complexity metrics are subsequently combined into 
a single relative complexity measure for the purpose of 
comparing and classifying programs. In particular, the 
relative complexity metric can be seen to represent the 
complexity of a particular software module at a particular 
level of system release. The authors investigate McCabe 
complexity metrics, Halstead software science metrics, and 
NCLOC metrics. The comparison of complexity is again 
of a relative and subjective nature. 


Binder and Poore [2] investigate the possibility of in- 
cluding the number of comments in the code as a variable 
in determining the quality of the code. They assert that 
comments only contribute to quality when they are needed 
and meaningful. The authors suggest a software quality 
measure called the “LB-ratio,” which is defined as the ra- 
tio of the number of operators to the sum of the number of 
operands and the number of comments. The authors agree 
that their experiments with the LB-ratio need additional 
work and refinement since including the concept of mean- 
ingful comments in the formula seems to be problematic 
and subjective at best. 

The following suggestions were deduced from these 
sources: 

(1) An estimate of errors and reliability can be deter- 
mined from software product metrics [20,27,29]. 

(2) Software product metrics could be used to find error- 
prone modules and could form the control element in 
a formal software maintenance methodology [15-18]. 

(3) The software product metrics that may be consid- 
ered include all of Halstead’s software science met- 
rics, McCabe’s complexity metric [14,23,27], and 
NCLOC [21]. 

(4) Factor analysis can be used to identify those software 
measures that are highly and significantly related to 
all other measures. This economy of description will 
facilitate the analysis of software complexity [21]. 

(5) Comments in the code contribute to the quality of 
software [2]. 

We therefore took the following actions: 

(1) Determined the Halstead software science, McCabe 
complexity, NCLOC, and LB-ratio from sequential 
releases of a representative software system. 

(2) Performed factor analysis on the metrics from the 
software modules to determine the unique dimen- 
sions represented by the metrics. 

(3) Proposed a model to calculate a relative metric. 

(4) Determined if this metric can identify maintenance- 
prone modules in the software by using the mean- 
plus-one standard deviation as the relative metric 
cut-off value.

II. Method, Analysis, and Results

A. Representative System and Metrics Collection 

1. Nature of Software. We analyzed the source 
program in the very long baseline interferometry (VLBI) 
receiver controller (VRC) software system by using factor 
analysis for 16 software measures. The source program is 
a real-time embedded system in the receiver-exciter sub- 
system of NASA’s DSN. It serves as a communication in- 
terface to VLBI subsystems and configures and monitors 
the status of the narrow-channel bandwidth VLBI receiver 
assembly. Three releases of the system software were an- 
alyzed: OP-B (222 modules), OP-C (224 modules), and a 
draft version of OP-D (235 modules). These were used as 
a representative maintenance project in this study. The 
source code for these three releases was originally written 
in PL/M but was later converted to C using the PLC86 
conversion program (from Micro-Processor Services). 

2. Software Metrics and Measures. Software met- 
rics are quantitative measures of certain characteristics of 
a development project that can be valuable management 
and engineering tools. Software metrics can be used to 
achieve various project-specific results, such as predicting 
source-code complexity at the design phase; monitoring 
and controlling software reliability and functionality; pre- 
dicting cost and schedule; and identifying high-risk mod- 
ules in a software project [28]. 

The 16 software measures that were used to analyze 
the VRC software are given in Table 3. The first eight 
measures belong to the Halstead software science family of 
software complexity measures. Halstead [14] uses a series 
of software science equations to measure the complexity 
of a program based on the lexical counts of symbols used. 
Generally, the measurements are made for each module, 
and the total measurements of the modules constitute the 
measurement of the program. Halstead’s metrics become 
available only after the coding is done, and therefore can 
be of use only during the testing and maintenance phases. 
Although Halstead’s metrics are useful in determining the 
complexity of programs, their weaknesses are that they 
do not measure control flow complexity and have little 
predictive value. 
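As a rough illustration of how the Halstead measures follow from lexical counts, the sketch below applies the definitions of length, estimated length, volume, and effort listed in Table 3. The function name and the sample counts are hypothetical; the counts are chosen only so that the arithmetic is easy to follow.

```python
import math

def halstead(n1, n2, N1, N2):
    """Halstead measures from lexical counts.

    n1, n2: number of unique operators and unique operands
    N1, N2: number of total operators and total operands
    """
    length = N1 + N2
    estimated_length = n1 * math.log2(n1) + n2 * math.log2(n2)
    volume = length * math.log2(n1 + n2)
    effort = volume / ((2 / n1) * (n2 / N2))
    return {
        "N": length,
        "N_hat": estimated_length,
        "V": volume,
        "E": effort,
    }

# Hypothetical counts for one small module (illustrative only).
measures = halstead(n1=4, n2=4, N1=8, N2=8)
print(measures)
```

Note that nothing in these formulas depends on branching, which is why the Halstead family says little about control flow.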

Measures 9 and 10, i.e., VG1 and VG2, are due to McCabe
and were adapted from the mathematical concepts
of graph theory. The McCabe cyclomatic complexity metric
VG1 is a measure of the maximum number of linearly
independent circuits in a program control graph. The primary
purpose of this metric is to identify software modules
that will be difficult to test or maintain, as explained by
McCabe [20]. The value of the McCabe metric is available
only after the detailed design is done. Although the
McCabe metric is very useful for measuring control flow 
complexity, its weakness is that it is not sensitive to pro- 
gram size; for example, if programs of different sizes are 
composed exclusively of sequential statements, then they 
may have the same cyclomatic number. 
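The size insensitivity noted above can be seen in a short sketch. It uses the counting rules given in Table 3 (number of decisions + 1 for the cyclomatic measure), and it reads the extended measure as decisions + ANDs + ORs + 1. The function names and the sample counts are illustrative assumptions.

```python
def cyclomatic(decisions):
    """McCabe cyclomatic complexity: number of decisions + 1."""
    return decisions + 1

def extended_cyclomatic(decisions, ands, ors):
    """Extended complexity: each AND/OR in a condition adds one more path."""
    return decisions + ands + ors + 1

# Two hypothetical straight-line modules of very different size:
# neither has a decision, so both get the same cyclomatic number.
short_module = {"statements": 10, "decisions": 0}
long_module = {"statements": 1000, "decisions": 0}

vg_short = cyclomatic(short_module["decisions"])
vg_long = cyclomatic(long_module["decisions"])
print(vg_short, vg_long)

# A hypothetical module with 3 decisions, one of which is "a AND b OR c".
print(cyclomatic(3), extended_cyclomatic(3, ands=1, ors=1))
```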

Measures 11-15 deal with the size of the program or 
the number of lines. Although many researchers do not 
find this measure as appealing, Boehm [3] points out that 
no other metric has a clear advantage over NCLOC as 
a metric. It is easy to measure, is conceptually familiar 
to software developers, and is used in most productivity 
databases and cost estimation models. 

Measure 16, the LB-ratio, is defined by Binder and 
Poore [2] as the ratio of the number of operators to the 
sum of the number of operands and the number of com- 
ments. It appears to capture the idea of distinguishing 
between meaningful comments in the code and just com- 
ments in general. The weakness of this metric is its re- 
liance on defining the number of meaningful comments, 
which seems to be more subjective than quantitative. 
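A minimal sketch of the LB-ratio as defined above, taking the count of blank and comment lines (B/C in Table 3) as the comment term. The function name is hypothetical. The sample inputs are the OP-B mean values from Table 4; because a ratio of means is not the mean of per-module ratios, the result is only indicative.

```python
def lb_ratio(total_operators, total_operands, comment_lines):
    """Ratio of operators to (operands + comments)."""
    denominator = total_operands + comment_lines
    if denominator == 0:
        raise ValueError("module has no operands and no comments")
    return total_operators / denominator

# OP-B mean values from Table 4: N1 = 70, N2 = 42, B/C = 43.
ratio = lb_ratio(total_operators=70, total_operands=42, comment_lines=43)
print(round(ratio, 2))
```

The formula counts every comment line equally, so it cannot by itself tell a meaningful comment from a redundant one.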

B. Analysis of Data, Models, and Validation 

The 16 software measures of the three releases of the 
VRC code, OP-B, OP-C, and draft OP-D, were analyzed
using factor analysis, correlation, analysis of variance, and 
regression analysis. Table 4 shows the number of modules 
and the mean value per module for each of the 16 measures. 
Figures 3-5 show the correlation matrix of the 16 mea- 
sures for the three releases. The data show a high degree 
of correlation. Except for the LB-ratio measure, the re- 
maining 15 measures are highly correlated. It can be seen 
that the Halstead volume metric (V), the McCabe cyclomatic
complexity metric (VG1), and the NCLOC metric
are highly and significantly correlated, while the LB-ratio 
metric is not. These results agree with those of other re- 
searchers, such as Ramamurthy and Melton [22], Gill and 
Kemerer [12], Samadzadeh and Nandakumar [25], Basili 
and Hutchins [1], Evangelist [9], and Kafura and Reddy 
[17]. 
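For readers who want to reproduce this kind of check on their own module data, a correlation matrix can be computed as sketched below. The data are hypothetical and chosen only to mimic the pattern reported here (size and volume moving together, the LB-ratio moving independently); the function names are also assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named measures."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical per-module measurements (illustrative only).
data = {
    "NCLOC": [10, 20, 30, 40, 50],
    "V": [90, 380, 680, 975, 1270],
    "LB": [1.2, 0.7, 1.1, 0.8, 1.0],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```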

The factor analysis matrix is shown in Table 5. All 
measures except the LB-ratio are loaded on factor 1, and 
thus there is no cross-loading. This is a desired result, 
since cross-loading on many factors makes the interpre- 
tation of the result ambiguous. The analysis of variance 
of the three sets of releases did not show any significant 
difference at the level of significance of 0.05. This means 
that, on the average, the values of, say, the McCabe cyclomatic
complexity metric (VG1) of the three releases are
not significantly different at an alpha of 5 percent. The same
is also true for the other 15 measures. 

Regression analysis was used to develop models
of relationships among the most interrelated measures. These
are the Halstead volume metric (V), the McCabe cyclomatic
metric (VG1), and the non-commented lines of code
(NCLOC) metric, as discussed next.

1. Factor Analysis Discussion. Three releases of 
software were analyzed by factor analysis to show the ex- 
istence of meaningful relationships among known software 
complexity measures. The analysis shows the number of 
factors where software complexity measures tend to load 
high or low, and also the percentage of the variability ex- 
plained by each factor. This research also shows the matrix 
of correlation summarizing the relationships among the 16 
software complexity measures for each release. 

Factor analysis of the three releases of software
showed that the first 15 measures of complexity are closely
related to some measure of similarity and are consequently
all interrelated. However, the 16th complexity measure
(LB-ratio) does not seem to be typical of the other 15
measures, and thus it is unlike the rest of the data set.
The 3 releases show 2 factors that concisely state the pattern
of relationships within the 16 measures. However,
measures 1-15 load most strongly on the first factor with
explained variability of 90 to 91 percent, while the second
factor displays less interesting patterns with explained
variability of 9 to 10 percent. Factor analysis also showed that
three complexity measures, the McCabe cyclomatic complexity
metric (VG1), the Halstead volume metric (V),
and NCLOC, are highly and strongly related. Therefore,
in order to achieve an economy of description, these
three measures are considered to give a strong similarity
and representation of all the 15 measures.
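The idea of measures "loading" on a factor can be illustrated with a small sketch. It assumes principal-component extraction from the correlation matrix, which the source does not confirm, and it uses power iteration to find only the first factor. The function name and the 3-by-3 correlation matrix are hypothetical; the matrix imitates two strongly related measures plus one unrelated measure.

```python
import math

def first_factor_loadings(corr, iterations=200):
    """Loadings on the first principal component of a correlation matrix.

    Uses power iteration; returns (loadings, share), where share is the
    fraction of total variance carried by the first component.
    """
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    loadings = [x * math.sqrt(eigenvalue) for x in v]
    return loadings, eigenvalue / n

# Hypothetical correlation matrix: measures 0 and 1 are strongly related,
# measure 2 is unrelated to both (illustrative only).
corr = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
loadings, share = first_factor_loadings(corr)
print([round(x, 3) for x in loadings], round(share, 3))
```

In this toy case the two related measures load heavily on the first factor and the unrelated one does not, which is the same qualitative pattern reported for measures 1-15 versus the LB-ratio.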

The correlation matrix for each release of the software 
also shows that the first 15 complexity measures are re- 
lated, while the LB-ratio measure is not related or inter- 
related to any of the other 15 measures. 

Analysis of variance does not show any significant dif- 
ference between the three releases at the level of signif- 
icance of 5 percent. This means that as the software 
evolves through its releases, the interrelationships between 
the complexity measures seem to be preserved. However, 
we should note that without normalization to size, adding 
on to a program will make a more complex program. This 
seems to agree with the findings of other researchers, as
discussed by Valett and McGarry [30], Harrison and Cook
[15], and Schneidewind [26]. 

Since factor analysis techniques showed that the first
15 software measures are closely related to some measure
of similarity, and since 3 of these measures, the McCabe
cyclomatic complexity metric (VG1), the Halstead volume
metric (V), and the NCLOC metric, are highly and significantly
related, they are considered to give a strong similarity
and representation of all 15 measures. This economy of
description made it appealing to develop a set of parsimonious
models for software complexity measurements using
data from the three software releases. The five composite
models together with their coefficients of determination
(R^2) are shown in Table 6.
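A single-predictor model of this kind, together with its coefficient of determination, can be fitted by ordinary least squares as sketched below. The function name and the data points are hypothetical (the points lie exactly on a line so that the result is easy to verify), and the source does not state which fitting procedure or software was used.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical (NCLOC, VG1) pairs for four modules (illustrative only).
ncloc = [10, 20, 30, 40]
vg1 = [1.5, 2.5, 3.5, 4.5]
intercept, slope, r_squared = fit_line(ncloc, vg1)
print(intercept, slope, r_squared)
```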

Statistical analysis, model back testing, and model testing
with independent segments of software were used for
validation of the composite models and ascertaining their
degree of accuracy. The developed models showed a
high degree of accuracy in predicting software complexity,
and thus they can serve as a baseline for other software
projects in identifying software modules with high complexity
(maintenance prone), so that actions can be taken
before their release to users.

2. Back Testing of Models. The five composite
complexity models shown in Table 6 were checked with
actual data from the three releases, OP-B, OP-C, and
OP-D. Table 7 and Fig. 6 show the actual average values
of the dependent variable (VG1) and values predicted by
the first three models. Table 8 and Fig. 7 show the actual
average values of (V) and values predicted by models
4 and 5. It can be seen that the difference in predicting
(VG1) by the first three composite models ranges from 3.2
to 10.6 percent below the actual average value of (VG1),
as calculated by the McCabe cyclomatic complexity metric.
Also, the difference in predicting (V) by models 4 and
5 ranges from 1.2 to 1.3 percent above the actual average
value of (V), as calculated by Halstead’s volume metric.
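The back-testing arithmetic for one release can be sketched as follows. The model coefficients are those listed in Table 6, with the model 2 slope taken as 0.136. The inputs are the OP-B means (V = 704 and NCLOC = 30 from Table 4, actual VG1 = 4.45 from Table 7). The function names are hypothetical, and small differences from the tabulated values are to be expected because the table inputs are rounded.

```python
def model_1(v):
    """Estimated VG1 from Halstead volume."""
    return 1.48 + 0.005 * v

def model_2(ncloc):
    """Estimated VG1 from non-commented lines of code."""
    return 0.510 + 0.136 * ncloc

def model_3(v, ncloc):
    """Estimated VG1 from both volume and NCLOC."""
    return 0.786 + 0.0013 * v + 0.0976 * ncloc

def percent_error(actual, predicted):
    """Error percentage as used in the tables: 100 * (A - P) / A."""
    return 100.0 * (actual - predicted) / actual

# OP-B release means.
V_MEAN, NCLOC_MEAN, VG1_ACTUAL = 704, 30, 4.45

for name, predicted in [
    ("model 1", model_1(V_MEAN)),
    ("model 2", model_2(NCLOC_MEAN)),
    ("model 3", model_3(V_MEAN, NCLOC_MEAN)),
]:
    error = percent_error(VG1_ACTUAL, predicted)
    print(f"{name}: predicted {predicted:.2f}, error {error:.1f} percent")
```

A negative error here means the predicted value exceeds the actual value, following the (A) - (P) sign convention of the tables.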

3. Testing the Five Composite Models by External
Check. The five composite complexity models were
tested against four independent segments of software with
characteristics as shown in Table 9. A sample calculation
of actual average values of (VG1) and values predicted by
model 1 for the four segments of software is shown in Table 10.
The summary of the actual grand average values
of (VG1) and (V) and their values, as predicted by models
1, 2, and 3 and models 4 and 5, respectively, for the four
segments of software, is shown in Tables 11 and 12 and
Figs. 8 and 9. It can be seen that the difference in predicting
(VG1) by the first three composite models ranges from
17.3 percent below to 0.7 percent above the actual average
value of (VG1). Also, the difference in predicting (V)
by models 4 and 5 is 9.7 percent above the actual average
value of (V) for the four segments of software.
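The sample calculation for model 1 can be retraced with the segment averages listed in Table 9. This is a sketch under the assumption that the grand average is the unweighted mean over the four segments; the function and variable names are hypothetical.

```python
def model_1(v):
    """Estimated VG1 from Halstead volume (model 1 of Table 6)."""
    return 1.48 + 0.005 * v

def external_check(model, segments):
    """Compare actual and predicted VG1, averaged over the segments.

    Returns (mean_actual, mean_predicted, delta) with delta = A - P.
    """
    actual = [seg["vg1"] for seg in segments]
    predicted = [model(seg["v"]) for seg in segments]
    mean_actual = sum(actual) / len(actual)
    mean_predicted = sum(predicted) / len(predicted)
    return mean_actual, mean_predicted, mean_actual - mean_predicted

# Actual average VG1 and V per segment, from Table 9.
SEGMENTS = [
    {"vg1": 16.4, "v": 3343},
    {"vg1": 17.9, "v": 4016},
    {"vg1": 8.16, "v": 1823},
    {"vg1": 11.10, "v": 2212},
]

mean_actual, mean_predicted, delta = external_check(model_1, SEGMENTS)
print(f"actual {mean_actual:.2f}, predicted {mean_predicted:.2f}, "
      f"delta {delta:.2f}")
```

The resulting relative error is roughly 17 percent, in line with the figure quoted for model 1.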

C. Parsimonious Model and Relative Complexity 

Since the five complexity models developed in this
study show direct relationships between (VG1) and (V)
and also (NCLOC), we chose the third model,

\[ \langle VG_1 \rangle = 0.786 + 0.0013\,(V) + 0.0976\,(NCLOC) \]

as a representative model for estimating the value of
(VG1), given the measured values of (V) and (NCLOC).


1. Development of the Relative Complexity
Metric. We propose to capture the total complexity of
a program based on its control flow complexity, the lexical
counts of symbols used, and the program size. In
essence, a complexity metric that accounts for a program's
total complexity due to volume and control flow and is
normalized by the number of lines of code would present a
relative complexity metric that is more useful to consider
for detecting maintenance-prone programs. The relative
complexity metric (RCM) will be derived for each module
from the measured value of (V), the estimated value of
(VG1) from model 3, and normalized by the module lines
of code. The RCM for a module is

\[ (RCM)_i = \frac{\langle VG_1 \rangle_i + V_i}{NCLOC_i} \]

2. Analysis of the Three Releases Using the Relative
Complexity Metric. The RCM was used to analyze
the modules of the three releases, as shown in Table 13.
Note that, as reported by Kafura and Reddy [17],
the RCM has grown with each release, from a 2799 total
in OP-B to a 3470 total in the draft of OP-D.

Using the criterion of the mean relative complexity
value plus one standard deviation as a cut-off value for
acceptable modules, we can identify those modules that
can be considered as outliers, or maintenance-prone modules.
Results for the three releases are given in Table 14.

In order to determine whether the modules above the
cut-off value were more at risk to be modified for enhancement
or fixes than modules below the cut-off value, the
transitions between the releases were examined. The results
appear in Table 15. Of the 33 modules over the
cut-off value of RCM in OP-B, 40 percent were actually
modified in order to implement OP-C. Of the 36 modules
in OP-C over OP-C’s RCM cut-off value, 50 percent were
actually modified to implement the draft version of OP-D.

Although the cut-off value seems to evenly divide the
modules that were actually modified, the modules over the
cut-off value for each release were more likely to be changed
than the modules below the cut-off value. The RCM was,
therefore, able to identify maintenance-prone modules.
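The RCM calculation and its cut-off can be sketched as follows. The sketch assumes that the RCM of a module is the sum of the model 3 estimate of VG1 and the measured volume V, divided by the module's NCLOC, which is how the printed equation reads together with the surrounding description. The function names are hypothetical. The mean and standard deviation pairs are the values listed in Table 13.

```python
def estimated_vg1(v, ncloc):
    """Model 3 estimate of the cyclomatic complexity."""
    return 0.786 + 0.0013 * v + 0.0976 * ncloc

def rcm(v, ncloc):
    """Relative complexity metric of one module (assumed form, see above)."""
    return (estimated_vg1(v, ncloc) + v) / ncloc

def rcm_cutoff(mean_rcm, sd_rcm):
    """Cut-off for acceptable modules: mean plus one standard deviation."""
    return mean_rcm + sd_rcm

# Mean and standard deviation of the RCM per release, from Table 13.
RELEASE_STATS = {
    "OP-B": (12.6, 10.0),
    "OP-C": (12.7, 9.6),
    "OP-D": (14.8, 11.3),
}

cutoffs = {name: round(rcm_cutoff(m, sd), 1)
           for name, (m, sd) in RELEASE_STATS.items()}
print(cutoffs)

# A hypothetical module with V = 704 and NCLOC = 30 (illustrative only).
print(round(rcm(704, 30), 1))
```

A module is then treated as maintenance prone when its RCM exceeds the cut-off for its release.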


III. Discussion and Conclusion 

Given that a metric that measures software complexity 
should prove to be a useful predictor of software mainte- 
nance costs, it is recommended that modules that show a 
high order of complexity within a release be looked upon as 
modules with a propensity to become maintenance prone 
after release and delivery to users. It is imperative that 
a maintenance-prone module be improved, enhanced, or 
simplified into two or more modules before final delivery. 
The composite complexity models and the relative com- 
plexity metric developed in this study can be considered 
as a baseline for comparison with other projects and may 
serve as a set point for simplifying and reducing complexity
of developed software.

Table 1. Percentage of time spent on various maintenance tasks.

                          Percentage of time spent
Maintenance tasks         1977    1985    1987    1990
Enhancements              59      44      41      43
Corrections               22      15      18      16
Supporting users          NA(a)   21      12      12
Reengineering             NA      NA      10      9
Adaptations               6       8       9       8
Documentation             6       NA      5       6
Tuning                    4       NA      3       5
Evaluating requests       NA      8       NA      NA
Other                     3       4       2       1

(a) Not applicable.


Table 2. Comptroller General statistics on
delivered software.

Quality of                        Percentage of
software delivered                software delivered
Could work on delivery            2
Could work after some rework      3
Never successfully put to use     45
Extensively reworked              20
Useless                           30
Total                             100



Table 3. Software measures used to analyze the VRC software.

Measure
number    Measure     Measure definition
1         n1          Number of unique operators
2         n2          Number of unique operands
3         N1          Number of total operators
4         N2          Number of total operands
5         N           Length (N1 + N2)
6         N^          Estimated length = n1 log2(n1) + n2 log2(n2)
7         V           Volume = N log2(n) = (N1 + N2) log2(n1 + n2)
8         E           Effort = V/[(2/n1)(n2/N2)]
9         VG1         McCabe cyclomatic complexity (number of decisions + 1)
10        VG2         Extended complexity (decisions + ANDs + ORs + 1)
11        LOC         Lines of code (includes blank and comment lines)
12        B/C         Number of blank lines + number of comment lines
13        <;>         Number of executable semicolons
14        SP          Average maximum lines between variable references
15        NCLOC       Non-commented lines of code = LOC - B/C
16        LB-ratio    N1/(N2 + B/C)


Table 4. OP-B, OP-C, and OP-D modules and the mean values of
the 16 measures.

Measure               OP-B (222        OP-C (224        OP-D (235
number    Measure     modules) mean    modules) mean    modules) mean
1         n1          12               12               13
2         n2          12               12               15
3         N1          70               75               87
4         N2          42               44               52
5         N           113              119              140
6         N^          103              110              126
7         V           704              721              844
8         E           53,781           58,198           61,715
9         VG1         4                4                5
10        VG2         5                4                5
11        LOC         73               78               83
12        B/C         43               46               49
13        <;>         12               13               15
14        SP          5                5                6
15        NCLOC       30               31               34
16        LB-ratio    1                1                1

Table 5. The factor matrix for the 16 measures of OP-C, OP-B, and OP-D.

Measure               OP-B                  OP-C                  OP-D
number    Measure     Factor 1  Factor 2    Factor 1  Factor 2    Factor 1  Factor 2
1         n1          0.78      -0.17       0.79      -0.12       0.78      -0.17
2         n2          0.94      -0.02       0.94      -0.02       0.93      -0.03
3         N1          0.97      0.10        0.98      0.83        0.97      0.08
4         N2          0.97      0.06        0.97      0.04        0.96      -0.05
5         N           0.98      0.09        0.98      0.07        0.97      0.07
6         N^          0.91      -0.01       0.96      -0.00       0.96      -0.01
7         V           0.96      0.14        0.97      0.09        0.96      0.09
8         E           0.89      0.22        0.90      0.15        0.88      0.15
9         VG1         0.94      0.09        0.95      0.08        0.93      0.10
10        VG2         0.77      0.12        0.95      0.07        0.93      0.10
11        LOC         0.94      -0.25       0.96      -0.17       0.95      -0.19
12        B/C         0.61      -0.64       0.72      -0.50       0.70      -0.53
13        <;>         0.97      0.03        0.97      0.04        0.97      0.06
14        SP          0.70      -0.05       0.60      -0.01       0.72      0.04
15        NCLOC       0.98      0.05        0.98      0.05        0.98      0.05
16        LB-ratio    -0.03     0.83        -0.01     0.92        -0.02     0.90
Percentage of
explained variability 90        10          91        9           91        9


Table 6. Five composite complexity models and their coefficients
of determination.

Model                                                       Coefficient of
number    Model                                             determination, percent
1         <VG1> = 1.48 + 0.005(V)                           R^2 = 96
2         <VG1> = 0.510 + 0.136(NCLOC)                      R^2 = 96
3         <VG1> = 0.786 + 0.0013(V) + 0.0976(NCLOC)         R^2 = 96
4         <V> = -206 + 29.5(NCLOC)                          R^2 = 99
5         <V> = -210 + 8.7(VG1) + 28.3(NCLOC)               R^2 = 99


Table 7. Summary of actual average values of (VG1) and values predicted by models 1, 2, and 3.

                          (VG1) value
                                                      Delta,       Error percentage,
Model    Release          Actual, (A)  Predicted, (P) (A)-(P)      delta/(A)
1        OP-B             4.45         5.00           -0.55        -12.40
         OP-C             4.53         5.09           -0.56        -12.40
         OP-D             5.30         5.70           -0.40        -7.50
         Grand average    4.76         5.26           -0.50        -10.60
2        OP-B             4.45         4.59           -0.14        -3.10
         OP-C             4.53         4.86           -0.33        -7.30
         OP-D             5.30         5.27           -0.03        0.60
         Grand average    4.76         4.91           -0.15        -3.10
3        OP-B             4.45         4.62           -0.17        -3.80
         OP-C             4.53         4.84           -0.31        -6.80
         OP-D             5.30         5.30           -0.00        0.00
         Grand average    4.76         4.92           -0.16        -3.40


Table 8. Summary of actual average values of (V) and values predicted by models 4 and 5.

                          (V) value
                                                      Delta,       Error percentage,
Model    Release          Actual, (A)  Predicted, (P) (A)-(P)      delta/(A)
4        OP-B             704          679            +25          +3.6
         OP-C             722          738            -16          -2.2
         OP-D             845          826            +19          +2.2
         Grand average    757          748            +9           +1.2
5        OP-B             704          678            +26          +3.7
         OP-C             722          735            -13          -1.8
         OP-D             845          826            +19          +2.2
         Grand average    757          746            -10          +1.3





Table 9. Characteristics of four independent segments
of software.

Segment    Number of    Actual average value
number     modules      VG1       V        NCLOC
1          16           16.4      3343     102
2          16           17.9      4016     139
3          50           8.16      1823     64
4          55           11.10     2212     71


Table 10. Sample calculation of actual average values of (VG1) and values predicted
by model 1 for segments 1-4.

                          (VG1) value
                                                      Delta,       Error percentage,
Model    Segment          Actual, (A)  Predicted, (P) (A)-(P)      delta/(A)
1        1                16.40        18.19          -1.79        -10.9
         2                17.90        21.56          -3.66        -20.4
         3                8.16         10.59          -2.03        -24.4
         4                11.10        12.54          -1.44        -13.0
         Grand average    13.39        15.72          -2.33        -17.3


Table 11. Summary of actual grand average values of (VG1) and values predicted by
models 1, 2, and 3 for segments 1-4.

                    (VG1) grand average value
                                                Delta,       Error percentage,
Model    Segment    Actual, (A)  Predicted, (P) (A)-(P)      delta/(A)
1        1-4        13.39        15.57          -2.33        -17.3
2        1-4        13.39        13.31          +0.08        +0.6
3        1-4        13.39        13.48          -0.09        +0.7

Table 12. Summary of actual grand average values of (V) and values predicted by
models 4 and 5 for segments 1-4.

                    (V) grand average value
                                                Delta,       Error percentage,
Model    Segment    Actual, (A)  Predicted, (P) (A)-(P)      delta/(A)
4        1-4        2848         2570           +278         +9.7
5        1-4        2848         2571           +277         +9.7


Table 13. Analysis of three software releases using the relative complexity metric.

           Total number    Relative complexity
Release    of modules      Total    Maximum    Minimum    Median    Mean    Standard deviation
OP-B       222             2799     45         0.4        10.9      12.6    10.0
OP-C       224             2837     45         0.4        10.9      12.7    9.6
OP-D       235             3470     49         0.4        12.2      14.8    11.3


Table 14. Cut-off values of the three software releases.

                                            Number of            Percentage of
           Total number    (RCM)            modules exceeding    modules over
Release    of modules      cut-off value    (RCM) cut-off value  (RCM) cut-off value
OP-B       222             22.6             33.0                 15.0
OP-C       224             22.3             36.0                 16.0
OP-D       235             26.1             35.0                 15.0


Table 15. Analysis of transitions between the three software releases.

                                                           Percentage of       Percentage of all
                                                           modified modules    modules over cut-off
                      Number of           (RCM)            over cut-off        value that were
Transition            modules modified    cut-off value    value               actually modified
From OP-B to OP-C     13                  22.6             46                  40
From OP-C to OP-D     38                  22.3             47                  50

Fig. 4. Correlation matrix of 16 measures for OP-C.

Fig. 5. Correlation matrix of 16 measures for OP-D.

Fig. 7. Actual average values of (V) and values predicted by
models 4 and 5.

Fig. 8. Actual average values of (VG1) and values predicted by
models 1, 2, and 3 for independent segments of software.

Fig. 9. Actual average values of (V) and values predicted by
models 4 and 5 for independent segments of software.




